Novamente: An Integrative Architecture for General Intelligence
Abstract
The Novamente AI Engine is briefly reviewed. The overall architecture is unique, drawing on system-theoretic ideas regarding complex mental dynamics and associated emergent patterns. We describe how these are facilitated by a novel knowledge representation which allows diverse cognitive processes to interact effectively. We then elaborate the two primary cognitive algorithms used to construct these processes: probabilistic term logic (PTL) and the Bayesian Optimization Algorithm (BOA). PTL is a highly flexible inference framework, applicable to domains involving uncertain, dynamic data, and autonomous agents in complex environments. BOA is a population-based optimization algorithm which can incorporate prior knowledge. While originally designed to operate on bit strings, our extended version also learns programs and predicates with variable length and tree-like structure, used to represent actions, perceptions, and internal state. We detail some of the specific dynamics and structures we expect to emerge through the interaction of the cognitive processes, outline our approach to training the system through experiential interactive learning, and conclude with a description of some recent results obtained with our partial implementation, including practical work in bioinformatics, natural language processing, and knowledge discovery.

Introduction and Motivation

The primary motivation behind the Novamente AI Engine is to build a system that can achieve complex goals in complex environments, a synopsis of the definition of intelligence given in (Goertzel 1993). The emphasis is on the plurality of goals and environments. A chess-playing program is not a general intelligence, nor is a data mining engine, nor is a program that can cleverly manipulate a researcher-constructed microworld.
A general intelligence must be able to carry out a variety of different tasks in a variety of different contexts, generalizing knowledge between contexts and building up a context- and task-independent pragmatic understanding of itself and the world.

First among the tenets underlying the design is an understanding of mind as the interpenetration of a physical system with an abstract set of patterns, where a pattern is quantified in terms of algorithmic information theory (Goertzel 1997, Chaitin 1987). In essence, a pattern in an entity is an abstract program that is smaller than the entity, and can rapidly compute the entity or its approximation. For instance, a pattern in a drawing of a sine curve might be a program that can compute the curve from a formula. The understanding of mind as pattern ties in naturally with the interpretation of intelligence, at the most abstract level, as a problem of finding compact programs that encapsulate patterns in the environment, in the system itself, and in behavior. This concept was first seriously elaborated in Solomonoff's work on the theory of induction (Solomonoff 1964, Solomonoff 1978), and has been developed more rigorously and completely in Hutter's recent work (Hutter 2000), which integrates a body of prior work on algorithmic information theory and statistical decision theory to formalize the concept of general intelligence. Novamente can be proven to be arbitrarily intelligent according to Hutter's definition, if given sufficient computational resources. Baum has expounded the cognitive science implications of this perspective on intelligence (Baum 2004).

Another tenet of the Novamente approach is the realization that intelligence most naturally emerges through situated and social experience. Abstract thoughts and representations are facilitated through the recognition and manipulation of patterns in environments with which a system has sensorimotor interaction; see for example (Boroditsky and Ramscar 2002).
This interaction, embodied in the right cognitive architecture, leads to autonomy, experiential interactive learning, and goal-oriented self-modification; a mind continually adapts based on what it learns from its environment and the entities it interacts with.

The final tenet is a view of the internal organization of a mind as a collection of semi-autonomous agents embedded in a common substrate (Goertzel 1993). In this vein, Novamente is less extreme than some alternative approaches such as Minsky's Society of Mind (Minsky 1986), where agents are largely independent. Novamente is based on the idea that minds are self-organizing systems of agents, which interact with some degree of individual freedom, but are also constrained by an overall architecture involving a degree of inbuilt executive control, which nudges the self-organizing dynamics towards emergent hierarchical organization.

These abstract principles are coherently unified in a philosophy of cognition called the psynet model (Goertzel 1997), which is foundational to Novamente and provides a moderately detailed theory of the emergent structures and dynamics in intelligent systems. In the model, mental functions such as perception, action, reasoning and procedure learning are described in terms of interactions between agents. Any mind, at a given interval of time, is assumed to have a particular goal system, which may be expressed explicitly and/or implicitly. Thus, the dynamics of a cognitive system are understood to be governed by two main forces: self-organization and goal-oriented behavior. More specifically, several primary dynamical principles are posited, including:

• Association. Patterns, when given attention, spread some of this attention to other patterns that they have previously been associated with in some way. Related is Peirce's "law of mind" (Peirce 1892), which could be paraphrased in modern terms as stating that the mind is an associative memory network, whose dynamics dictate that every idea in the memory is an active agent, continually acting on those ideas with which the memory associates it.
• Differential attention allocation. Patterns that have been valuable for goal-achievement are given more attention, and are encouraged to participate in giving rise to new patterns.
• Pattern creation. Patterns that have been valuable for goal-achievement are mutated and combined with each other to yield new patterns.
• Credit assignment. Habitual patterns in the system that are found valuable for goal-achievement are explicitly reinforced and made more habitual.

Furthermore, the network of patterns in the system must give rise to the following large-scale emergent structures:

• Hierarchical network. Patterns are habitually in relations of control over other patterns that represent more specialized aspects of themselves.
• Heterarchical network. The system retains a memory of which patterns have previously been associated with each other in any way.
• Dual network. Hierarchical and heterarchical structures are combined, with the dynamics of the two structures working together harmoniously.
• Self structure. A portion of the network of patterns forms into an approximate (fractal) image of the overall network of patterns.

Structures and Algorithms

The psynet model does not tell you how to build a mind; it only says, in general terms, what a mind should be like. It would be possible to create many different AI designs based loosely on the psynet model; one example is the Webmind AI Engine developed in the late 1990s (Goertzel et al. 2000, Goertzel 2002). Novamente, as a specific system inspired by the psynet model, owes many of its details to the limitations imposed by contemporary hardware performance and software design methodologies.
Furthermore, Novamente is intended to utilize a minimal number of distinct knowledge representation structures and cognitive algorithms. Regarding knowledge representation, we have chosen an intermediate-level atom network representation which somewhat resembles classic semantic networks but has dynamic aspects more similar to neural networks. This enables a breadth of cognitive dynamics, but in a way that uses drastically less memory and processing than a lower-level, neural-network-style approach. The details of the representation have been designed for compatibility with the system's cognitive algorithms.

Regarding cognition, we have reduced the set of fundamental algorithms to two: Probabilistic Term Logic (PTL) and the Bayesian Optimization Algorithm (BOA). The former deals with the local creation of new pieces of knowledge from existing pieces of knowledge; the latter is more oriented towards global optimization, and creates new knowledge by integrating large amounts of existing knowledge. These two algorithms interact in several ways, reflecting the necessary interdependence of local and holistic cognition. Having reduced the basic knowledge representations and cognitive algorithms to this minimal core, the diverse functional specializations required for pragmatic general intelligence are provided by a number of node and link types in the atom network, and a high-level architecture consisting of functionally specialized lobes, each deploying the same structures and algorithms for particular purposes (see Figure 1 below).

Knowledge Representation

Knowledge representation in Novamente involves two levels, the explicit and the emergent. This section focuses on the explicit level; the emergent level involves self-organizing structures called maps, and will be discussed later, after the fundamental cognitive dynamics have been introduced.
Explicit knowledge representation in Novamente involves discrete units (atoms) of several types: nodes, links, and containers, which are ordered or unordered collections of atoms. Each atom is associated with a truth value, indicating, roughly, the degree to which it correctly describes the world. Novamente has been designed with several different types of truth values in mind; the simplest of these consists of a pair of values denoting probability and weight of evidence. All atoms also have an associated attention value, indicating how much computational effort should be expended on them; it contains two values, specifying short- and long-term importance levels.

Novamente node types include tokens which derive their meaning via interrelationships with other nodes, nodes representing perceptual inputs into the system (e.g., pixels, points in time, etc.), nodes representing moments and intervals of time, and procedures (described below). Links represent relationships between atoms, such as fuzzy set membership, probabilistic logical relationships, implication, hypotheticality, and context. The particular types and subtypes used, and the justifications for their inclusion, are omitted for brevity.

Procedures in Novamente are objects which produce an output, possibly based on a sequence of atoms as input. They may contain generalized combinator trees: computer programs written in "sloppy" combinatory logic, a language that we have developed specifically to meet the needs of tightly integrated inference and learning. Combinatory logic (CL) is a simple yet Turing-complete computational system (Curry and Feys 1958). Its basic units are combinators, higher-order functions that always produce new higher-order functions when applied. Beyond combinators, our language contains numbers, arithmetic operators, looping constructs, and conditionals. Procedures may also contain embedded references to other procedures.
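The atom/truth-value/attention-value scheme just described can be sketched as plain data structures. This is an illustrative sketch only: the class and field names below are ours, not Novamente's, and real atoms carry considerably more machinery.

```python
from dataclasses import dataclass

@dataclass
class TruthValue:
    strength: float            # probability-like strength in [0, 1]
    weight_of_evidence: float  # how much evidence backs the strength estimate

@dataclass
class AttentionValue:
    short_term_importance: float  # governs immediate processing effort
    long_term_importance: float   # governs retention in memory

@dataclass
class Atom:
    truth: TruthValue
    attention: AttentionValue

@dataclass
class Node(Atom):
    name: str

@dataclass
class Link(Atom):
    link_type: str  # e.g. "Inheritance", "Similarity", "Member"
    targets: tuple  # the atoms this link connects

# A toy fragment of an atom network: "cat inherits from animal".
cat = Node(TruthValue(0.9, 0.8), AttentionValue(0.5, 0.2), "cat")
animal = Node(TruthValue(0.95, 0.9), AttentionValue(0.3, 0.4), "animal")
inh = Link(TruthValue(0.99, 0.7), AttentionValue(0.1, 0.1),
           "Inheritance", (cat, animal))
```

The point of the sketch is the granularity: an atom is far coarser than a neuron but finer than a frame or rule, which is what makes the same substrate usable by both logical inference and attention dynamics.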
Two unique features of this language that are advantageous for our purposes are that programs can be expressed as binary trees whose leaves contain the program elements, and that variables are not necessary (though they may be introduced where useful). Furthermore, we have generalized the evaluation system of combinatory logic so there are no type restrictions on programs, allowing them to be easily modified and evolved by Novamente's cognitive processes (hence "sloppy"). A full exposition of the language is omitted for brevity; see (Looks, Goertzel, and Pennachin 2004).

Schemata and predicates are procedures that output atoms and truth values, respectively. Special-purpose predicates, instead of containing combinator trees, represent specific queries that report to the Novamente system some fact about its own state. Predicates may also be designated as goal nodes, in which case the system orients towards making them true.

Cognitive Algorithms

Novamente's cognitive processes make use of two main algorithms, Probabilistic Term Logic (PTL) and the Bayesian Optimization Algorithm (BOA), described below.

Probabilistic Term Logic

PTL is a highly flexible inference framework, applicable to many different situations, including inference involving uncertain, dynamic data and/or data of mixed type, and inference involving autonomous agents in complex environments. It was designed specifically for use in Novamente, yet also has applicability beyond the Novamente framework; see (Goertzel et al. 2004) for a full exposition. The goals motivating the development of PTL were the desire to have an inference system that:

• Operates consistently with probability theory when deployed within any local context (which may be adaptively identified).
• Deals rapidly (but not always perfectly accurately) with large quantities of data, yet allows arbitrarily accurate and careful reasoning on smaller amounts of information, when this is deemed appropriate.
• Deals well with the fact that different beliefs and ideas are bolstered by different amounts of evidence (for an explanation of how traditional probabilistic models fail here, see Wang 1993).
• Enables robust, flexible analogical inference between different domains of knowledge (Indurkhya 1992).
• Does not require a globally consistent probability model of the world, but is able to create locally consistent models of local contexts, and maintain a dynamically almost-consistent overall world-model, dealing gracefully with inconsistencies as they occur.
• Encompasses both abstract, precise mathematical reasoning and more speculative hypothetical, inductive, and/or analogical reasoning.
• Encompasses the inference of both declarative and procedural knowledge.
• Deals with inconsistent initial premises by dynamically iterating into a condition of "reasonable almost-consistency and reasonable compatibility with the premises", thus, for example, perceiving sensory reality in a way compatible with conceptual understanding, in the manner portrayed by Gestalt psychology (Kohler 1993) and developed in the contemporary neural network literature; see (Haikonen 2003).
• Makes most humanly simple inferences appear brief, compact and simple. For a sustained argument that term logic exceeds predicate logic in this regard, see (Sommers and Englebretsen 2000).

Figure 1. Each component is a Lobe, which contains multiple atom types and mind agents. Lobes may span multiple machines, and are controlled by schemata which may be adapted or replaced by new ones learned by Schema Learning, as decided by the Schema Learning Controller. The diagram shows a configuration with a single interaction channel, which contains sensors, actuators and linguistic input; real deployments may contain multiple channels, with different properties.

One difference between PTL and standard probabilistic frameworks is that PTL deals with multivariable truth values.
Its minimal truth value object has two components: strength and weight of evidence. Alternately, it can use probability distributions (or discrete approximations thereof) as truth values. This, along with the fact that PTL does not assume that all probabilities are estimated from the same sample space, makes a large difference in the handling of various realistic inference situations.

Another difference is PTL's awareness of context. The context used by PTL can be universal (everything the system has ever seen), local (only the information directly involved in a given inference), or many levels in between. This provides a way of toggling between more rigorous and more speculative inference, and also a way of making inference consistent within a given context even when the system's overall knowledge base is not entirely consistent.

First-order PTL deals with probabilistic inference on (asymmetric) inheritance and (symmetric) similarity relationships, where different Novamente link types are used to represent intensional versus extensional relationships (Wang 1995). The inference rules here are deduction (A→B, B→C ⊢ A→C), inversion (Bayes' rule), similarity-to-inheritance conversion, and revision (which merges different estimates of the truth value of the same atom). Each inference rule comes with its own quantitative truth value formula, derived using probability theory and related considerations. Analogical inference is obtained as a combination of deductive and inversive inference, and via their effects on map dynamics (described later).

Higher-order PTL deals with inference on links that point to links rather than nodes, and on predicates and schemata. The truth value functions here are the same as in first-order PTL, but the interpretations of the functions are different. Variable-free inference using combinators and inference using explicit variables and quantifiers are both supported; the two styles may be freely intermingled.
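The quantitative truth value formulas themselves are omitted above. As a hedged illustration of the kind of formula involved, here is a standard independence-based strength rule for deduction, together with Bayesian inversion. This sketch is ours, derived from elementary probability under an independence assumption; it is not necessarily the exact formula PTL uses, and it handles only the strength component, ignoring weight of evidence.

```python
def deduction(sAB, sBC, sB, sC):
    """Strength of A->C given A->B (sAB) and B->C (sBC), plus node
    strengths sB = P(B), sC = P(C).

    Derivation: P(C|A) = P(C|A,B)P(B|A) + P(C|A,~B)P(~B|A); assuming
    independence, P(C|A,B) ~ P(C|B) = sBC and
    P(C|A,~B) ~ P(C|~B) = (sC - sB*sBC) / (1 - sB).
    """
    if sB >= 1.0:  # degenerate case: B covers everything
        return sBC
    return sAB * sBC + (1.0 - sAB) * (sC - sB * sBC) / (1.0 - sB)

def inversion(sAB, sA, sB):
    """Strength of B->A from A->B via Bayes' rule: P(A|B) = P(B|A)P(A)/P(B)."""
    return sAB * sA / sB
```

A sanity check on the deduction rule: with sAB = sBC = 1 (A fully inherits from B, B from C), the formula returns 1 regardless of the node strengths, i.e., full transitivity, as expected.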
The Bayesian Optimization Algorithm

BOA is a population-based optimization algorithm that works on bit strings. BOA significantly outperforms the genetic algorithm on a range of simple optimization problems by maintaining a centralized probabilistic model of the population it is evolving (Pelikan, Goldberg, and Cantú-Paz 1999; Pelikan 2002). When a candidate population has been evaluated for fitness, BOA seeks to uncover dependencies between the variables that characterize "good" candidate solutions (e.g., the correct value for position 5 in the genome depends on the value at position 7), and adds them to its model. In this way, it is hoped that BOA will explicitly discover and utilize probabilistic "building blocks", which are then used to generate new candidate solutions to populate the next generation. The basic algorithm is as follows (adapted from Pelikan 2002):

(1) Generate a random initial population P(0).
(2) Use the best instances in P(t) to learn a model M(t).
(3) Generate a new set of instances O(t) from M(t).
(4) Create P(t+1) by merging O(t) and P(t) according to some criterion.
(5) Iterate steps (2) through (4) until the termination criteria are satisfied.

This algorithm tends to preserve good collections of variable assignments throughout the evolution, and can explore new areas of the search space in a more directed and focused way than GA/GP, while retaining the positive traits of a population-based optimization algorithm (diversity of candidate solutions and non-local search). For Novamente, we have extended BOA to evolve programs, written in our sloppy combinatory logic representation, rather than bit strings (Looks, Goertzel, and Pennachin 2004). Previous work by Ocenasek (Ocenasek 2002) has extended binary BOA to fixed-length strings with non-binary discrete and continuous variables. There is a fundamental difference between learning fixed-length strings and programs.
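The five-step loop above can be made concrete with a minimal sketch. To stay short we substitute the simplest possible model, independent per-bit marginal frequencies (UMDA-style); real BOA instead learns a Bayesian network capturing dependencies between variables. All names and parameter choices here are ours.

```python
import random

def model_build_sample_loop(fitness, n_bits=20, pop_size=100, n_gens=30,
                            elite_frac=0.5, seed=0):
    """Sketch of the BOA-style loop, steps (1)-(5), with a univariate
    marginal model standing in for BOA's Bayesian network."""
    rng = random.Random(seed)
    # (1) generate a random initial population P(0)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(n_gens):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: int(pop_size * elite_frac)]
        # (2) learn a model M(t) from the best instances in P(t)
        probs = [sum(ind[i] for ind in elite) / len(elite)
                 for i in range(n_bits)]
        # (3) generate new instances O(t) by sampling from M(t)
        offspring = [[1 if rng.random() < p else 0 for p in probs]
                     for _ in range(pop_size // 2)]
        # (4) create P(t+1) by merging O(t) and P(t), keeping the best
        pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
    # (5) terminate; here, after a fixed number of generations
    return max(pop, key=fitness)

# Onemax: fitness is simply the number of ones in the bit string.
best = model_build_sample_loop(sum)
```

Even this crude univariate model solves onemax easily, because the problem has no inter-variable dependencies; the payoff of BOA's full dependency model only appears on deceptive or linked problems.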
In the former, one is evolving individuals with a fixed level of complexity and functionality. In order to evolve non-trivial program trees, however, one must rely on incremental progress: a simpler program accretes complexity over time until it is correct. While BOA as described above is effective at optimizing and preserving small components, it does not innately lead to the addition of new components. To remedy this, we have added probabilistic variables to the instance generation process that, when activated, have effects similar to crossover in genetic programming; see (Looks, Goertzel, and Pennachin 2004) for details and examples.

We have also begun using BOA to discover surprising patterns in large bodies of data, using an approach we call pattern mining. In this application, BOA is given a number of predefined predicates, such as isTall(X), loves(X,Y), isMale(X), etc. Patterns are logical combinations of predicates, e.g., isTall(X) AND isMale(X). Based on the overall philosophy behind Novamente, patterns are evaluated for "interestingness", which is composed of two factors: pattern-intensity, and novelty relative to the system's current knowledge base. Pattern-intensity refers to how well a pattern compresses regularities in the system's knowledge; this can be quantified as the difference between the actual frequency of the expression and the frequency that would be expected assuming probabilistic independence. In a domain consisting of random men and women, the example given above would be intense, because tallness correlates with maleness. If this correlation were neither known to the system nor easily derivable by it, the pattern would also be acceptably novel, and thus considered "interesting".

This pattern mining approach might run into scalability issues, as the predicate space tends to be very large. We can resolve this problem by encoding the fact that varying degrees of similarity exist between predicates.
When similarity is meaningfully quantified, an important cognitive mechanism used in creative thought called slippage comes into play, whereby ideas are transformed by substituting one concept for another in an intelligent, context-dependent fashion (Hofstadter 1986). We incorporate prior similarity information by embedding predicates in real spaces, so that similar predicates are embedded close to each other. When a refinement of this approach is used, BOA can construct new procedures that make use of existing ones, leading to hierarchical design.

A number of factors make our modified BOA variant advantageous for use within Novamente. The centralized probabilistic model can be constructed with the aid of PTL inference and prior knowledge, allowing them to be used in instance generation. Conversely, fitness evaluation of instances generated by BOA under a particular model can be used to revise existing knowledge in the system and to infer new knowledge. As with most evolutionary techniques, BOA can also be used to perform a number of other learning tasks inside Novamente, such as categorization, unsupervised clustering of atoms, and function learning, given appropriate fitness functions.
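The pattern-intensity measure described above has a simple quantitative core: the actual frequency of a conjunction minus the frequency expected if its predicates were independent. The sketch below is our illustrative formulation of that measure on a toy tall/male domain; the function and variable names are ours, not Novamente code.

```python
def intensity(rows, preds):
    """Pattern-intensity of a conjunction of unary predicates over a dataset:
    actual frequency of the conjunction minus the product of the predicates'
    marginal frequencies (the frequency expected under independence)."""
    n = len(rows)
    actual = sum(all(p(r) for p in preds) for r in rows) / n
    expected = 1.0
    for p in preds:
        expected *= sum(p(r) for r in rows) / n
    return actual - expected

# Toy domain: (height_cm, is_male) pairs where tallness correlates with maleness.
people = [(185, True), (180, True), (178, True), (160, False),
          (158, False), (163, False), (150, True), (190, False)]
is_tall = lambda r: r[0] >= 175
is_male = lambda r: r[1]

# Positive score: "tall AND male" occurs more often than independence predicts.
score = intensity(people, [is_tall, is_male])
```

A score near zero would mean the conjunction is unsurprising given the marginals; the larger the score, the more the pattern compresses, which is exactly the regularity a pattern miner should flag (novelty relative to existing knowledge is then checked separately).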
Publication date: 2004